Abstract

In this work we develop a novel and fast traffic sign recognition system, an important component of advanced driver assistance systems and autonomous driving. Traffic signs play a vital role in safe driving and accident avoidance. We use image processing and the topic discovery model pLSA to tackle this challenging multiclass classification problem. Our algorithm consists of two parts, shape classification and sign classification, for improved accuracy. For image processing and representation we use the bag-of-features model with the SIFT local descriptor, where a visual vocabulary of 300 words is formed using the k-means codebook formation algorithm. We exploit the concept that every image is a collection of visual topics and that images sharing the same topics belong to the same category. Our algorithm is tested on the German Traffic Sign Recognition Benchmark (GTSRB) and gives very promising results, close to existing state-of-the-art techniques.

I. Introduction

Traffic sign recognition systems play an important role in autonomous driving and in advanced driver assistance systems. Traffic signs are essential for driving safely and avoiding accidents in daily life. A smart system that assists a human driver by automatically recognizing and classifying signs and issuing warnings eases the driver's task, and with advances in modern vehicle design and safety this problem has received considerable attention. A human driver may miss an important traffic sign because of fatigue, distraction, or occlusion of the sign by road obstructions and natural scenery, which may result in a severe accident. An automatic recognition system also helps with proper navigation and with following traffic rules. Developing a complete traffic sign recognition system involves several challenges: occlusion of signs by background objects such as trees and buildings, old and damaged signs, weather conditions, viewpoint variation, and illumination changes between day and night. A complete traffic sign system consists of detection and classification of a wide variety of signs. Classifying signs of different classes that share the same shape is always difficult, since the differences between many traffic signs are very small.

Fig. 1. Main Concept


In this work we present a novel classification technique based on probabilistic latent semantic analysis (pLSA). We have also built a shape classification system based on a pyramid of HOG features and template matching. For feature representation in the final pLSA system, the well-known SIFT descriptor is used.

Previous methods that achieved state-of-the-art results on GTSRB and GTSDB are very complex to train and need supervision. Methods based on convolutional neural networks, neural networks and support vector machines are computationally complex and take effort to implement on systems such as FPGA devices or other devices with low computational power. Our pLSA-based method is computationally more flexible than these methods; it is unsupervised and uses a KNN classifier at the final step. We show our results on the publicly available German traffic sign (GTSRB) database, where we obtain accuracies between 96.86% and 100% across subcategories. The rest of the paper is organised as follows: section 2 reviews the existing literature in this area, section 3 describes our main algorithm, and section 4 gives details of the dataset and experimental results, along with a comparison with other existing methods.

II. Related Work

Considerable research exists on the detection and classification of traffic signs that addresses the challenges involved in real-life problems. Although it is not possible to cover all of this research here, we give a brief overview of some relevant works. Most of the work uses computer vision and machine learning techniques on data from several camera sensors mounted on the car roof at different angles. Some researchers explore detection based on colour features, for example converting the colour space from RGB to HSV and then using colour thresholding for detection, followed by classification with the well-known support vector machine (SVM). In the colour thresholding approach, morphological operations such as connected component analysis are applied for accurate localisation. Colour, shape and motion information together with Haar wavelet based features were used in [12]. By using SVM-based colour classification on a block of pixels, Le et al. [13] addressed the problem of weather variation. Features such as SIFT, HOG and Haar wavelets were used in some of these works. In the German Traffic Sign Recognition Benchmark (GTSRB) competition, the top-performing algorithm exceeds the best human classification accuracy: using a committee of neural networks, [3] achieved the highest performance of 99.46%, whereas the best human performance was 98.84%. A multiscale convolutional network [15] achieved 98.31% accuracy on this dataset. Other algorithms based on k-d trees and random forests [16] and LDA on HOG1 [2] also obtained very high accuracy. Group sparse coding was used by [14] for rich feature learning in traffic sign recognition.

Fig. 2. Sign category and its shapes


III. Model for Classification 

The aim of this work is to develop a topic-based classification framework for fast and reliable traffic sign categorization. Discovering hidden topics in images and correlating them with images that share similar topics yields a successful classification algorithm. Since each traffic sign category can reasonably be assumed to be a combination of one or more topics, we have chosen this method over other learning-based methods. We classify the images in two steps: we first classify the shape of the traffic sign and then classify its actual class. An image may have undergone different types of rotation due to viewpoint variation and alignment, so as a preprocessing task we apply an affine invariant transform to the images to remove rotational effects. The main idea of our method is depicted in Fig. [1].


A. Shape Classification 

In this step the traffic signs are divided into six classes, specifically triangle, square, circle, single-circle, rectangle and hexagon, as shown in Fig. [2]. Dividing the images by shape helps classification, since the topic of a traffic sign most of the time depends on its shape; different shape classes are associated with different levels of danger on the road. Obtaining the correct shape of an image therefore gives a boost to the subsequent classification. HOG [7] is very successful for shape representation. HOG accurately captures the structure of an image: at low resolution it captures high-level features and leaves out fine-grained pixel detail. This helps us represent the images by their outline shape while discarding their interior content. We first form five HOG templates as references for five shapes. For classification, a pyramid of HOG features is built at different resolutions of the image. A template matching method then assigns the shape of the image to one of the template shapes, based on the normalized correlation value.

$$ \mathrm{dist}(T, Im) = \sum_{[i,j] \in S} \bigl( T(i,j) - Im(i,j) \bigr)^2 \qquad (1) $$
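The distance in (1) can be illustrated with a minimal sketch. The toy 4x4 arrays below stand in for HOG responses, and the function names and the `templates` dictionary are hypothetical; the pyramid over image resolutions is left out.

```python
import numpy as np

def ssd_distance(template, image):
    """Sum of squared differences between two equally sized maps, as in Eq. (1)."""
    template = np.asarray(template, dtype=float)
    image = np.asarray(image, dtype=float)
    if template.shape != image.shape:
        raise ValueError("template and image must have the same shape")
    return float(np.sum((template - image) ** 2))

def classify_shape(feature_map, templates):
    """Return the name of the template with the smallest distance.

    `templates` is a hypothetical dict mapping a shape name to a reference map.
    """
    return min(templates, key=lambda name: ssd_distance(templates[name], feature_map))

# Toy 4x4 maps standing in for HOG responses (illustrative values only).
templates = {
    "circle": np.array([[0, 1, 1, 0], [1, 0, 0, 1], [1, 0, 0, 1], [0, 1, 1, 0]]),
    "square": np.array([[1, 1, 1, 1], [1, 0, 0, 1], [1, 0, 0, 1], [1, 1, 1, 1]]),
}
query = templates["square"].copy()
query[0, 0] = 0  # corrupt one cell of the square
predicted_shape = classify_shape(query, templates)
```

In a multi-resolution setting the same distance could be accumulated over the pyramid levels before taking the minimum; that aggregation rule is an assumption and is not specified in the text.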

B. Sign Classification 

At this step we have images of known shape. To classify their actual sign on the basis of hidden topics, a probabilistic Latent Semantic Analysis (pLSA) based classification technique is used. Since an image is very high dimensional data, we preprocess it to reduce its dimensionality using a visual codebook formation method. Each image is represented as a bag of visual words, where each visual word captures relevant information stored in the image.
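As a rough illustration of the bag-of-visual-words representation, the sketch below assigns each local descriptor to its nearest codeword and counts the assignments. This hard assignment is a simplification assumed here for clarity; the function name and the toy 2-D data are hypothetical.

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Normalised histogram of nearest-codeword assignments.

    descriptors: (n, D) local descriptors of one image.
    codebook:    (M, D) visual words, one per row.
    """
    descriptors = np.asarray(descriptors, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    # Squared Euclidean distance between every descriptor and every word.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    counts = np.bincount(nearest, minlength=codebook.shape[0]).astype(float)
    return counts / counts.sum()

# Toy 2-D example: three visual words and four descriptors.
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
descriptors = np.array([[0.1, 0.0], [0.9, 0.1], [1.1, -0.1], [0.0, 0.8]])
hist = bow_histogram(descriptors, codebook)
```

The resulting vector has one entry per visual word, so every image is reduced to a fixed-length representation regardless of how many descriptors it contains.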

1) Feature Extraction: To extract meaningful information from images we use the SIFT descriptor, a very popular local descriptor for object recognition with dimension 128. The descriptor is scale invariant and suitable for object recognition and key-point matching. SIFT computation is based on the construction of a scale space, and it uses a difference-of-Gaussian filter to locate potential corner points. The actual implementation uses a 16x16 image patch divided into a 4x4 grid of cells, each described by an 8-bin orientation histogram, resulting in a 4x4x8 = 128 dimensional vector.

Fig. 5. pLSA concept
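To make the 128-dimensional layout concrete, the following sketch builds a simplified SIFT-style descriptor from a 16x16 patch. It is not the full SIFT algorithm: key-point detection, Gaussian weighting, interpolation and rotation normalisation are omitted, and the function name is hypothetical.

```python
import numpy as np

def sift_like_descriptor(patch, n_cells=4, n_bins=8):
    """Simplified SIFT-style descriptor for a square grayscale patch.

    Only illustrates the cells-by-bins layout; it is not a full SIFT
    implementation.
    """
    patch = np.asarray(patch, dtype=float)
    gy, gx = np.gradient(patch)                    # gradients along rows, columns
    magnitude = np.hypot(gx, gy)
    orientation = np.arctan2(gy, gx) % (2 * np.pi)
    # Quantise orientations into n_bins; clamp guards against rounding to n_bins.
    bins = np.minimum((orientation / (2 * np.pi) * n_bins).astype(int), n_bins - 1)
    cell = patch.shape[0] // n_cells
    descriptor = np.zeros((n_cells, n_cells, n_bins))
    for r in range(n_cells):
        for c in range(n_cells):
            rows = slice(r * cell, (r + 1) * cell)
            cols = slice(c * cell, (c + 1) * cell)
            # Magnitude-weighted orientation histogram of this cell.
            descriptor[r, c] = np.bincount(
                bins[rows, cols].ravel(),
                weights=magnitude[rows, cols].ravel(),
                minlength=n_bins,
            )
    vec = descriptor.ravel()
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

rng = np.random.default_rng(0)
patch = rng.random((16, 16))          # stand-in for a patch around a key point
desc = sift_like_descriptor(patch)
```

With 4x4 cells and 8 orientation bins the flattened vector has 4 x 4 x 8 = 128 entries, matching the descriptor dimension quoted above.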

2) LLC Codebook Formation: The visual words are formed with a method based on locality-constrained linear coding (LLC) [18]. This algorithm generates similar codes for similar descriptors by sharing bases, and locality also leads to sparsity. The idea that locality is more important than sparsity is expressed by the optimization problem below. Here $X = [x_1, x_2, \ldots, x_N] \in \mathbb{R}^{D \times N}$ is a set of $D$-dimensional local descriptors and $B = [b_1, b_2, \ldots, b_M] \in \mathbb{R}^{D \times M}$ is a codebook with $M$ entries.

$$ \min_{C} \sum_{i=1}^{N} \| x_i - B c_i \|^2 + \lambda \| d_i \odot c_i \|^2 \qquad (2) $$

$$ \text{s.t.} \quad \mathbf{1}^{\top} c_i = 1, \ \forall i \qquad (3) $$

$$ d_i = \exp\!\left( \frac{\mathrm{dist}(x_i, B)}{\sigma} \right) \qquad (4) $$

where $\odot$ denotes element-wise multiplication, and $d_i \in \mathbb{R}^{M}$ is the locality adaptor that gives a different freedom to each basis vector in proportion to its similarity to the input descriptor $x_i$. Here $\mathrm{dist}(x_i, B)$ is the vector of Euclidean distances between $x_i$ and the entries of $B$, and $\sigma$ adjusts the weight decay speed of the locality adaptor, as in [18]. The constraint in equation (3) ensures shift invariance. In our case we have 128-dimensional SIFT descriptors $x_i$ for each image, and we form a codebook $B$ of 300 words. Each image is expressed as a combination of these words and represented as a histogram of visual words. The codebook formation idea is depicted in Fig. [4].
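A minimal sketch of LLC coding for one descriptor is given below. It uses the common k-nearest-neighbour approximation of the LLC objective rather than the exact problem with the locality adaptor, and the values of `k` and `beta`, the function name and the random stand-in data are assumptions made for illustration.

```python
import numpy as np

def llc_encode(x, codebook, k=5, beta=1e-4):
    """Approximate LLC code of one descriptor from its k nearest codewords.

    x:        (D,) descriptor.
    codebook: (M, D) codewords, one per row.
    Returns an (M,) code whose entries sum to 1, as required by the
    sum-to-one constraint.
    """
    x = np.asarray(x, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    k = min(k, codebook.shape[0])
    dists = np.linalg.norm(codebook - x, axis=1)
    nearest = np.argsort(dists)[:k]
    z = codebook[nearest] - x                  # shift the local bases to the origin
    cov = z @ z.T                              # (k, k) local covariance
    # Regularise so the linear system stays well conditioned.
    cov += (beta * np.trace(cov) + 1e-12) * np.eye(k)
    w = np.linalg.solve(cov, np.ones(k))
    w /= w.sum()                               # enforce the sum-to-one constraint
    code = np.zeros(codebook.shape[0])
    code[nearest] = w
    return code

rng = np.random.default_rng(0)
codebook = rng.normal(size=(300, 128))   # stand-in for a learned 300-word codebook
descriptor = rng.normal(size=128)        # stand-in for one SIFT descriptor
code = llc_encode(descriptor, codebook)
```

Only the k selected entries of the code are non-zero, which shows how locality leads to sparsity; pooling the codes of all descriptors in an image then gives one vector per image.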

3) pLSA model: Probabilistic latent semantic analysis [7] is a topic discovery model based on latent variable analysis. An image can also be considered as a collection of topics: every image is treated as a text document with words from a specific vocabulary. The vocabulary is formed by applying the k-means algorithm to SIFT features from the training images. Suppose we have a collection of $N$ images (documents) $D = \{d_1, d_2, \ldots, d_N\}$ and a corresponding vocabulary of size $N_1$, $W = \{w_1, w_2, \ldots, w_{N_1}\}$, and let there be $N_2$ topics $Z = \{z_1, z_2, \ldots, z_{N_2}\}$. The model parameters are computed using the expectation maximization method. The idea of the pLSA algorithm and a typical scenario are illustrated in Fig. [3] and Fig. [5].

$$ R = \sum_{d \in D} \sum_{w \in W} n(d, w) \qquad (9) $$

where $n(d, w)$ is the number of occurrences of word $w$ in image $d$, so that $R$ is the total word count. Using the histogram-of-words concept, each image is converted to a document over the previously designed vocabulary.
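The expectation maximization procedure can be sketched as follows, assuming the standard asymmetric pLSA formulation $P(w|d) = \sum_z P(w|z) P(z|d)$. The function name, the toy count matrix and the iteration counts are illustrative assumptions, and the dense three-dimensional posterior is only practical for small problems.

```python
import numpy as np

def plsa_em(counts, n_topics, n_iter=100, p_w_z=None, seed=0):
    """Fit pLSA with EM on a (documents x words) count matrix n(d, w).

    If `p_w_z` is given it is held fixed (fold-in for unseen images) and
    only P(z|d) is estimated. Returns P(w|z), P(z|d) and the
    log-likelihood recorded at every iteration.
    """
    counts = np.asarray(counts, dtype=float)
    n_docs, n_words = counts.shape
    rng = np.random.default_rng(seed)
    fixed_topics = p_w_z is not None
    if fixed_topics:
        p_w_z = np.asarray(p_w_z, dtype=float)
        n_topics = p_w_z.shape[0]
    else:
        p_w_z = rng.random((n_topics, n_words))
        p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    eps = 1e-12
    log_likelihood = []
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(w|z) P(z|d).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]   # (docs, topics, words)
        p_w_d = joint.sum(axis=1) + eps
        log_likelihood.append(float((counts * np.log(p_w_d)).sum()))
        posterior = joint / p_w_d[:, None, :]
        # M-step: re-estimate the multinomials from expected counts.
        expected = counts[:, None, :] * posterior
        if not fixed_topics:
            p_w_z = expected.sum(axis=0) + eps
            p_w_z /= p_w_z.sum(axis=1, keepdims=True)
        p_z_d = expected.sum(axis=2) + eps
        p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    return p_w_z, p_z_d, log_likelihood

# Toy corpus: four "images" over a four-word vocabulary.
counts = np.array([[5, 4, 0, 0], [4, 6, 1, 0], [0, 0, 5, 5], [0, 1, 4, 6]])
p_w_z, p_z_d, ll = plsa_em(counts, n_topics=2, n_iter=50)
# Fold-in: topic mixture of an unseen image with the learned P(w|z) held fixed.
_, p_z_test, _ = plsa_em(np.array([[3, 3, 0, 0]]), 2, n_iter=50, p_w_z=p_w_z)
```

The rows of `p_z_d` are the per-image topic distributions that a nearest-neighbour classifier can compare, and the `ll` trace can be plotted against the iteration index to check convergence.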

4) Fuzzy-KNN classification: Since our framework is based on topic modelling and discovery, a fuzzy KNN classification technique [19] is used to find the appropriate topic of a test image with respect to the training images. Fuzzy KNN performs better than the traditional KNN algorithm because it takes the weights of the neighbours into account. In the training stage we calculate $P(w|z)$, which is used as input to the testing algorithm to compute $P(z|d_{test})$. A K-nearest neighbour algorithm then classifies these images using the probability distribution $P(z|d_{train})$.

Here $u_i$ is the membership strength to be computed, and $u_{ij}$ is the previously labelled membership value of the $i$-th class for the $j$-th vector. The final class label of the query point $x$ is computed as follows:

$$ u_0(x) = \arg\max_i \, u_i(x) \qquad (11) $$
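The decision rule in (11) can be sketched with the widely used inverse-distance membership weighting of fuzzy KNN. That weighting, the fuzzifier `m`, the function name and the toy topic vectors are assumptions for illustration and may differ from the exact formulation in [19].

```python
import numpy as np

def fuzzy_knn(query, train_x, train_u, k=5, m=2.0):
    """Fuzzy k-NN with inverse-distance membership weighting.

    query:   (T,) feature vector, e.g. a topic distribution P(z|d_test).
    train_x: (N, T) training vectors, e.g. rows of P(z|d_train).
    train_u: (N, C) memberships u_ij of each training vector in each class
             (one-hot rows when the labels are crisp).
    Returns (predicted class index, membership vector u_i(x)).
    """
    query = np.asarray(query, dtype=float)
    train_x = np.asarray(train_x, dtype=float)
    train_u = np.asarray(train_u, dtype=float)
    dists = np.linalg.norm(train_x - query, axis=1)
    nearest = np.argsort(dists)[:k]
    # Closer neighbours get larger weights; the constant avoids division by zero.
    weights = 1.0 / (dists[nearest] ** (2.0 / (m - 1.0)) + 1e-12)
    memberships = weights @ train_u[nearest] / weights.sum()
    return int(np.argmax(memberships)), memberships

# Toy topic distributions for two classes of three training images each.
train_x = np.array([[0.9, 0.1], [0.8, 0.2], [0.85, 0.15],
                    [0.1, 0.9], [0.2, 0.8], [0.15, 0.85]])
train_u = np.eye(2)[[0, 0, 0, 1, 1, 1]]
label, memberships = fuzzy_knn([0.7, 0.3], train_x, train_u, k=3)
```

Unlike plain KNN voting, the returned membership vector also indicates how strongly the query belongs to each class, which is useful when neighbours from different classes lie at very different distances.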

Fig. 3. pLSA algorithm idea 


TABLE I. Accuracy (%) over sign subclasses

Algorithm              Speed Limits  Prohibitions  Derestrictions  Mandatory  Danger  Unique
Committee of CNNs [3]  99.47         99.93         99.72           99.89      99.07   99.22
Human                  98.32         99.87         98.89           100.00     99.21   100.00
Multi-Scale CNN [15]   98.61         99.87         94.44           97.18      98.3    98.63
pLSA                   98.82         98.27         97.93           96.86      96.95   100.00
Random Forest [16]     95.95         99.13         87.50           99.27      92.08   98.73
LDA on HOG2 [2]        95.37         96.80         85.83           97.18      93.73   98.63


IV. Experiment and Result 

To test the accuracy of our algorithm we use the well-known German Traffic Sign Recognition Benchmark (GTSRB) [17], which has 12000 training images covering 43 different classes of signs. This dataset contains a wide variety of images that reflect the different challenges involved in recognition. We also use the Belgium traffic sign recognition dataset [5], from which we take 19 extra sign classes that are not included in GTSRB; these classes mainly relate to different warnings. Deformation due to viewpoint variation, occlusion by obstacles such as trees and buildings, natural degradation and different weather conditions are all represented in these datasets. For training we also use different versions of the same image, specifically the image rotated by approximately 4° to the left and to the right, and translated versions of the original image. In addition, the images are preprocessed with a contrast adjustment method, and both the original and the contrast-adjusted images are used for training. The total number of images per category is shown in Table [2].

TABLE II. Number of images per category

Training set   Validation set   Test set
300            150              100

Before running the main algorithm we resize each image to 90 x 97. For testing we apply affine transformations to the same image to produce different versions of it and so remove the effect of viewpoint variation. pLSA is an iterative algorithm; for convergence of the parameters we used 182 iterations. Fig. [7] shows the variation of the log-likelihood with the number of iterations.

TABLE III. Comparison with state-of-the-art methods

Algorithm            Accuracy (%)
Committee of CNNs    99.46
Human Performance    98.84
Multi-Scale CNNs     98.31
pLSA                 98.14
Random Forest        96.14
LDA on HOG2          95.68
LDA on HOG1          93.18
LDA on HOG3          92.34

The images are first classified by shape using HOG features. Once we have this prior information about the shape, we apply our classification algorithm, which is based on SIFT features. All test images are preprocessed and converted to combinations of visual words using the LLC codebook formation method, as described in the earlier section for training images. The final result of our method is very promising. A comparison with different state-of-the-art methods is shown in Table [3]. We also compare the accuracy on sub classes such as speed limits, other prohibitions, derestrictions, danger, mandatory and priority, and obtain good classification results across all of them; Table [1] compares the results obtained by the different methods.

The algorithm is implemented on the MATLAB platform, along with some wrappers. All testing was carried out on a quad-core Intel i7 Linux machine.

The performance of our algorithm varies with the number of topics used in classification. Fig. [6] compares accuracy against the number of topics for the speed limits subclass; similar variation is obtained for the other subcategories. Accuracy is highest for between 29 and 44 topics, and we have chosen 35 topics as optimal for our system.

Fig. 6. Accuracy of Speed Limits category over TOPICS

Fig. 7. Log-likelihood over iterations

V. Conclusions 

In this work we have developed a novel and fast approach to the traffic sign recognition problem. Because there is little variability between different signs of the same shape, this is a very difficult classification problem. We exploit the representation of images as collections of words and topics by using the visual codebook formation concept, and we use probabilistic latent semantic analysis to compute the probability of topics in images. Our main contributions are the use of a novel pLSA model that achieves near state-of-the-art classification performance and the development of a shape-based classifier. The system is independent of colour. Hardware implementation of this approach is much easier than for SVM, neural network and deep learning based approaches.